Obtaining a Gold Standard Exercise

In this exercise, you'll be given labels for 60 mammograms that contain a suspicious mass. Whenever this occurs in a clinical setting, the patient is sent for a biopsy to determine whether the mass is benign or cancerous. The radiologist can still make a judgment about whether the mass appears malignant based on how it looks in the image.

Sometimes in algorithm development settings, we can obtain only radiologist reports and cannot obtain biopsy reports for all studies. Since the true gold standard label is the biopsy result, it helps to get opinions from several different radiologists on the image appearance to make a more robust ground truth assessment in the absence of biopsy data.

Here, you are provided with labels from three different radiologists who have the following levels of clinical experience:

Rad1 = 5 years
Rad2 = 10 years
Rad3 = 15 years

In this exercise, create three 'ground truths' (you can label benign as 1 and malignant as 0):

1. Using biopsy labels (true gold standard)
2. Using a voting system between the three radiologists
3. Using a weighted voting system with experience levels between the three radiologists
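
The two voting schemes can be sketched as follows. This is a minimal illustration in plain Python, assuming each radiologist's label is already encoded as 1 (benign) or 0 (malignant). The function names, the choice to use years of experience directly as vote weights, the tie-breaking rule, and the example labels are all assumptions made for illustration; they are not part of the provided data or a prescribed solution.

```python
# Assumed weights: years of experience for Rad1, Rad2, Rad3.
WEIGHTS = (5, 10, 15)

def majority_vote(labels):
    """Return 1 if more than half of the binary labels are 1, else 0."""
    return 1 if sum(labels) * 2 > len(labels) else 0

def weighted_vote(labels, weights=WEIGHTS):
    """Return 1 if the labels equal to 1 carry more than half the total weight.

    Ties resolve to 0; this tie-breaking rule is an arbitrary assumption.
    """
    weight_for_one = sum(w for label, w in zip(labels, weights) if label == 1)
    return 1 if weight_for_one * 2 > sum(weights) else 0

# Made-up example: Rad1 and Rad2 say benign (1), Rad3 says malignant (0).
example = (1, 1, 0)
print(majority_vote(example))  # two of three votes are 1
print(weighted_vote(example))  # weight 15 for label 1 vs. 15 for label 0: a tie
```

Note that with weights of 5, 10, and 15, Rad1 and Rad2 together carry exactly the same weight as Rad3 alone, so a tie occurs whenever those two disagree with Rad3. How to resolve that tie is a design decision you will need to make.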

Assess how 2 and 3 compare to 1: check whether the ground truths from 2 and 3 agree with the biopsy labels from 1.
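
One simple way to quantify agreement is the fraction of cases in which a candidate ground truth matches the biopsy label. The sketch below assumes both sets of labels are lists of 0/1 values in the same case order; the function name and the example values are hypothetical and not taken from the exercise data.

```python
def agreement_rate(candidate, reference):
    """Fraction of cases where the candidate label equals the reference label."""
    if len(candidate) != len(reference) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(c == r for c, r in zip(candidate, reference))
    return matches / len(reference)

# Made-up labels for four cases, purely illustrative.
biopsy = [1, 0, 1, 0]
voted = [1, 0, 0, 0]
print(agreement_rate(voted, biopsy))  # 3 of 4 cases match -> 0.75
```

Computing this rate once for the simple vote and once for the weighted vote gives a direct comparison of how closely each approximates the biopsy result.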

Code

If you need the code, it is available at https://github.com/udacity.
